Jiwon Jeon

Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR

May 11, 2026
Via arXiv

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Mar 25, 2026
Via arXiv

STAIRS-Former: Spatio-Temporal Attention with Interleaved Recursive Structure Transformer for Offline Multi-task Multi-agent Reinforcement Learning

Mar 12, 2026
Via arXiv